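We introduce AmsterTime: a challenging dataset for benchmarking visual place recognition (VPR) in the presence of severe domain shift. AmsterTime offers a curated collection of 2,500 images in which street-view images are matched to historical archival images of the same scenes in the city of Amsterdam. The image pairs capture the same place with different cameras, viewpoints, and appearances. Unlike existing benchmark datasets, AmsterTime is crowdsourced directly in a GIS navigation platform (Mapillary). We evaluate various baselines, including non-learning, supervised, and self-supervised methods pre-trained on different relevant datasets, on both verification and retrieval tasks. In our results, the best accuracy is achieved by a ResNet-101 model pre-trained on the Landmarks dataset, at 84% for verification and 24% for retrieval. In addition, a subset of Amsterdam landmarks is collected for feature evaluation in a classification task. The classification labels are further used to extract visual explanations with Grad-CAM, in order to inspect the similar visuals learned by deep metric learning models.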
translated by Google Translate
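The verification and retrieval tasks mentioned in the AmsterTime abstract can be illustrated with a minimal sketch. It assumes each image has already been reduced to a feature vector, that `queries[i]` and `gallery[i]` show the same place, and that cosine similarity with a fixed threshold is used for matching. The function names and the threshold are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top1_retrieval_accuracy(queries, gallery):
    """Fraction of queries whose most similar gallery item has the same index.

    Assumption: queries[i] and gallery[i] depict the same place.
    """
    hits = 0
    for i, q in enumerate(queries):
        best = max(range(len(gallery)), key=lambda j: cosine(q, gallery[j]))
        hits += int(best == i)
    return hits / len(queries)


def verification_accuracy(pairs, labels, threshold=0.5):
    """Fraction of pairs whose thresholded similarity agrees with the 0/1 label."""
    correct = sum(
        (cosine(u, v) >= threshold) == bool(y) for (u, v), y in zip(pairs, labels)
    )
    return correct / len(labels)


# Toy feature vectors (made up for illustration only).
archive = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
street = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.1]]
retrieval_acc = top1_retrieval_accuracy(street, archive)
```

Retrieval compares each query against the whole gallery, while verification only decides whether a given pair matches, which is one reason the two tasks can show very different accuracies on the same features.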
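Three-dimensional (3D) building models play an increasingly pivotal role in many real-world applications, yet obtaining a compact representation of buildings remains an open problem. In this paper, we present a novel framework for reconstructing compact, watertight, polygonal building models from point clouds. Our framework comprises three components: (a) a cell complex is generated via adaptive space partitioning, providing a polyhedral embedding as the candidate set; (b) an implicit field is learned by a deep neural network, which facilitates building occupancy estimation; (c) a Markov random field is formulated to extract the outer surface of a building via combinatorial optimization. We evaluate our method and compare it with state-of-the-art methods in shape reconstruction, surface approximation, and geometry simplification. Experiments on both synthetic and real-world point clouds demonstrate that, with our neural-guided strategy, high-quality building models can be obtained with significant advantages in fidelity, compactness, and computational efficiency. Our method is robust to noise and insufficient measurements, and it generalizes directly from synthetic scans to real-world measurements.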
Investigating and analyzing patient outcomes, including in-hospital mortality and length of stay, is crucial for helping clinicians anticipate a patient's outcome at the outset of hospitalization and for helping hospitals allocate their resources. This paper proposes an approach that combines the well-known gray wolf algorithm with frequent items extracted by association rule mining algorithms. First, the original features are combined with the discriminative frequent items that were extracted. The gray wolf algorithm is then used both to choose the best subset of these features and to tune the parameters of the classification algorithms. The framework was evaluated on a real dataset of 2816 patients from the Imam Ali Kermanshah Hospital in Iran. The study's findings indicate that low Ejection Fraction, old age, high CPK values, and high Creatinine levels are the main contributors to patient mortality. Several significant and interesting rules related to in-hospital mortality and length of stay have also been extracted and presented. Additionally, the accuracy, sensitivity, specificity, and AUROC of the proposed framework for predicting in-hospital mortality with the SVM classifier were 0.9961, 0.9477, 0.9992, and 0.9734, respectively. According to these findings, adding frequent items as features considerably improves classification accuracy.
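The first step described above, combining original features with frequent items, can be sketched as follows. This is only an illustration under stated assumptions: each patient is assumed to have a set of discretized items, each frequent itemset becomes one extra 0/1 column, and the item names (`low_EF`, `high_CPK`) and the function name are hypothetical. The gray wolf feature selection and parameter tuning steps are not shown.

```python
def augment_with_itemsets(features, items, frequent_itemsets):
    """Append one 0/1 column per frequent itemset to each patient's feature row.

    features: list of numeric feature rows (one per patient)
    items: list of sets of discretized items (one set per patient)
    frequent_itemsets: list of frozensets mined beforehand (assumed given)
    """
    augmented = []
    for row, patient_items in zip(features, items):
        # A flag is 1 when the patient exhibits every item in the itemset.
        flags = [int(itemset <= patient_items) for itemset in frequent_itemsets]
        augmented.append(list(row) + flags)
    return augmented


# Toy example with made-up values.
features = [[0.3, 71], [0.6, 45]]
items = [{"low_EF", "high_CPK"}, {"high_CPK"}]
frequent = [frozenset({"low_EF", "high_CPK"}), frozenset({"high_CPK"})]
rows = augment_with_itemsets(features, items, frequent)
```

Each added column tells the classifier whether a patient matches a co-occurrence pattern, which a linear or kernel model may not recover from the raw features alone.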
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
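In this paper, we propose an algorithm to interpolate between a pair of images of a dynamic scene. Although significant progress has been made in frame interpolation over the past few years, current approaches cannot handle images with brightness and illumination changes, which are common even when the images are captured shortly apart. We propose to address this problem by taking advantage of existing optical flow methods, which are highly robust to variations in illumination. Specifically, using the bidirectional flows estimated by an existing pre-trained flow network, we predict the flows from an intermediate frame to the two input images. To do this, we propose to encode the bidirectional flows into a coordinate-based network, powered by a hypernetwork, to obtain a continuous representation of the flow across time. Once we obtain the estimated flows, we use them within an existing blending network to produce the final intermediate frame. Through extensive experiments, we demonstrate that our approach produces significantly better results than state-of-the-art frame interpolation algorithms.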
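In this work, we study different approaches to self-supervised pretraining of object detection models. We first design a general framework for learning a spatially consistent dense representation from an image, by randomly sampling boxes, projecting them to each augmented view, and maximizing the similarity between the corresponding box features. We study existing design choices in the literature, such as box generation, feature extraction strategies, and the use of multiple views, inspired by its success in instance-level image representation learning techniques. Our results suggest that the method is robust to different choices of hyperparameters, and that using multiple views is not as effective as has been shown for instance-level image representation learning. We also design two auxiliary tasks that predict boxes in one view from their features in the other view, by (1) predicting boxes from the sampled set using a contrastive loss, and (2) predicting box coordinates using a transformer, which could potentially benefit downstream object detection tasks. We found that these tasks do not lead to better object detection performance when finetuning the pretrained model on labeled data.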
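The step of projecting a sampled box into an augmented view can be sketched as below. This is a hedged illustration, assuming the augmentation is a rectangular crop with an optional horizontal flip and that only boxes fully inside the crop are kept. The function name and these conventions are hypothetical and may differ from the paper's framework.

```python
def project_box(box, crop, flip=False):
    """Map a box from image pixels to normalized coordinates of a cropped view.

    box:  (x1, y1, x2, y2) in image pixels
    crop: (left, top, width, height) of the view in image pixels
    flip: whether the view is mirrored horizontally after cropping
    Returns (u1, v1, u2, v2) in [0, 1], or None if the box is not fully
    inside the crop (assumption: such boxes are discarded).
    """
    x1, y1, x2, y2 = box
    left, top, width, height = crop
    if x1 < left or y1 < top or x2 > left + width or y2 > top + height:
        return None
    u1, u2 = (x1 - left) / width, (x2 - left) / width
    v1, v2 = (y1 - top) / height, (y2 - top) / height
    if flip:
        # Mirroring swaps the left and right edges.
        u1, u2 = 1.0 - u2, 1.0 - u1
    return (u1, v1, u2, v2)


# The same image box seen in two views: a plain crop and a flipped crop.
view_a = project_box((20, 20, 60, 40), (0, 0, 100, 100))
view_b = project_box((20, 20, 60, 40), (0, 0, 100, 100), flip=True)
```

Features pooled from `view_a` and `view_b` describe the same image region, so a similarity objective can treat them as a matching pair.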
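The objective of this study is to develop a robust deep learning-based framework to distinguish COVID-19, Community-Acquired Pneumonia (CAP), and Normal cases based on chest CT scans acquired at different imaging centers using various protocols and radiation doses. We show that although our proposed model is trained on a relatively small dataset acquired from only one imaging center using a specific scanning protocol, it performs well on heterogeneous test sets obtained by multiple scanners using different technical parameters. We also show that the model can be updated via an unsupervised approach to cope with the data shift between the training and test sets, and to enhance its robustness upon receiving a new external dataset from a different center. We adopt an ensemble architecture to aggregate the predictions from multiple versions of the model. For initial training and development, an in-house dataset of 171 COVID-19, 60 CAP, and 76 Normal cases was used, containing volumetric CT scans acquired from one imaging center using a constant standard radiation dose scanning protocol. To evaluate the model, we retrospectively collected four different test sets to investigate the effects of shifts in the data characteristics on the model's performance. Among the test cases, there were CT scans with characteristics similar to the training set, as well as noisy low-dose and ultra-low-dose CT scans. In addition, some test CT scans were obtained from patients with a history of cardiovascular diseases or surgeries. The entire test dataset used in this study contains 51 COVID-19, 28 CAP, and 51 Normal cases. Experimental results indicate that our proposed framework performs well on all test sets, achieving a total accuracy of 96.15% (95% CI: [91.25-98.74]), a COVID-19 sensitivity of 96.08% (95% CI: [86.54-99.5]), and a CAP sensitivity of 92.86% (95% CI: [76.50-99.19]).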
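The abstract does not state how the ensemble aggregates its predictions. The sketch below assumes plain averaging of class probabilities across model versions, which is one common choice and not necessarily the one used in the study, and also shows how a per-class sensitivity is computed. All names and numbers are illustrative.

```python
def ensemble_predict(prob_lists):
    """Average class probabilities from several model versions, then take argmax.

    prob_lists: one probability vector per model version, same class order.
    Assumption: unweighted averaging; the study may aggregate differently.
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    mean = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=mean.__getitem__)
    return label, mean


def per_class_sensitivity(y_true, y_pred, cls):
    """Share of cases of class `cls` that were predicted as `cls`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == cls]
    if not relevant:
        return float("nan")
    return sum(p == cls for p in relevant) / len(relevant)


# Three hypothetical model versions scoring one scan over
# the classes (0: COVID-19, 1: CAP, 2: Normal).
label, mean = ensemble_predict(
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.6, 0.1]]
)
```

Averaging lets two moderately confident models outvote one confident outlier, as in the example where the first version alone would have predicted class 0.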
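Self-supervised representation learning has made significant leaps, fueled by progress in contrastive learning, which seeks to learn transformations that embed positive input pairs nearby while pushing negative pairs far apart. While positive pairs can be generated reliably (e.g., as different views of the same image), it is difficult to accurately establish negative pairs, which are defined as samples from different images regardless of their semantic content or visual features. A fundamental problem in contrastive learning is mitigating the effects of false negatives. Contrasting false negatives induces two critical issues in representation learning: discarding semantic information and slow convergence. In this paper, we propose novel approaches to identify false negatives, as well as two strategies to mitigate their effect, namely false negative elimination and attraction, while systematically performing rigorous evaluations to study this problem in detail. Our method exhibits consistent improvements over existing contrastive learning-based methods. Without labels, we identify false negatives with 40% accuracy among the 1000 semantic classes of ImageNet, and achieve a 5.8% absolute improvement in top-1 accuracy over the previous state of the art when finetuning with 1% of the labels. Our code is available at https://github.com/gogle-research/fnc.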
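The effect of false negative elimination on a contrastive objective can be sketched with the standard InfoNCE loss for a single anchor. This is a minimal illustration that assumes the false-negative flags are already given; how they are identified is the paper's contribution and is not reproduced here. The function names and the temperature value are hypothetical.

```python
import math


def info_nce(sim_pos, sim_negs, temperature=0.1):
    """InfoNCE loss for one anchor, given similarities to its positive and negatives."""
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    peak = max(logits)  # subtract the maximum for numerical stability
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


def eliminate_false_negatives(sim_negs, is_false_negative):
    """Drop negatives flagged as false negatives (flags are assumed given)."""
    return [s for s, flagged in zip(sim_negs, is_false_negative) if not flagged]


# One anchor with a positive (0.9) and two negatives; the first negative is
# highly similar and is flagged as a false negative in this toy example.
negatives = [0.8, 0.1]
loss_all = info_nce(0.9, negatives)
loss_clean = info_nce(0.9, eliminate_false_negatives(negatives, [True, False]))
```

Removing the flagged sample from the denominator stops the loss from pushing the anchor away from a semantically similar image. The attraction strategy mentioned in the abstract would go further and treat such samples as additional positives, which this sketch does not implement.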
Fast and easy handheld capture with guideline: closest object moves at most D pixels between views. Promote sampled views to local light field via layered scene representation. Blend neighboring local light fields to render novel views.
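The capture guideline above can be turned into a maximum camera spacing with the standard pinhole disparity relation, disparity = focal length × baseline / depth. The sketch assumes parallel, sideways-translated cameras and a focal length expressed in pixels. The function name and the example numbers are hypothetical.

```python
def max_camera_spacing(max_disparity_px, closest_depth, focal_length_px):
    """Largest baseline such that the closest object moves at most D pixels.

    Uses disparity = focal_length_px * baseline / depth for parallel cameras,
    so baseline <= D * depth / focal_length_px. The result is in the same
    unit as closest_depth.
    """
    if focal_length_px <= 0:
        raise ValueError("focal length must be positive")
    return max_disparity_px * closest_depth / focal_length_px


# Example: D = 64 pixels, closest object at 1.0 m, focal length 800 px.
spacing = max_camera_spacing(64, 1.0, 800)
```

Under these assumptions the closest scene point sets the limit, because nearer points shift more between neighboring views than distant ones.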